Penguin
This Quarto document serves as a practical illustration of the concepts covered in the Productive R Workflow online course.
1 Introduction
This document offers a straightforward analysis of the well-known penguin dataset. It is designed to complement the Productive R Workflow online course.
You can read more about the penguin dataset here.
Let’s load libraries before we start!
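The library-loading chunk is not shown in this rendering. A minimal sketch, assuming the package list can be inferred from the functions used later in the document (patchwork in particular is an assumption, based on the `(p1 + p2) / p3` layout syntax):

```r
# Packages inferred from the code used later in this document
library(tidyverse)   # dplyr verbs (filter, summarise) and ggplot2
library(hrbrthemes)  # theme_ipsum()
library(patchwork)   # assumed: combines ggplot objects with + and /
library(DT)          # interactive HTML tables
library(plotly)      # ggplotly()
```

Loading everything in one chunk at the top keeps the later chunks short and makes the document's dependencies easy to spot.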
2 Loading data
The dataset has already been loaded and cleaned in the previous step of this pipeline.
Let’s load the clean version, together with a few functions available in functions.R.
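The contents of functions.R are not shown here. As an illustration only, a helper such as `create_scatterplot()` (called later in this document) might look like the sketch below. The argument names, the plot title, and the use of `palmerpenguins::penguins` as a stand-in for the cleaned dataset are all assumptions, not the course's actual code.

```r
library(dplyr)
library(ggplot2)

# Hypothetical version of a helper that could live in functions.R:
# keep one species on one island, then plot bill depth against bill length
create_scatterplot <- function(data, selected_species, selected_island) {
  data |>
    filter(species == selected_species, island == selected_island) |>
    ggplot(aes(x = bill_length_mm, y = bill_depth_mm)) +
    geom_point(color = "#69b3a2", na.rm = TRUE) +
    labs(
      x = "Bill Length (mm)",
      y = "Bill Depth (mm)",
      title = paste(selected_species, "penguins on", selected_island)
    )
}

# Stand-in for the cleaned dataset produced earlier in the pipeline
data <- palmerpenguins::penguins
p <- create_scatterplot(data, "Adelie", "Torgersen")
```

Keeping helpers like this in a separate file and loading them with `source("functions.R")` lets the report itself stay focused on the analysis.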
3 Bill Length and Bill Depth
Now, let’s run a descriptive analysis, including summary statistics and graphs.
What’s striking is the slightly negative relationship between bill length and bill depth:
\[{\displaystyle Avg={\frac {1}{n}}\sum _{i=1}^{n}a_{i}={\frac {a_{1}+a_{2}+\cdots +a_{n}}{n}}}\]
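One way to put a number on the negative relationship is a Pearson correlation coefficient. The sketch below is an illustration rather than part of the original analysis; it assumes the same `!is.na(sex)` filter used for the plot, and the variable names are hypothetical.

```r
library(dplyr)

# Same filtering as the scatterplot: drop penguins with unknown sex
penguins_complete <- palmerpenguins::penguins |>
  filter(!is.na(sex))

# Pearson correlation between bill length and bill depth
bill_correlation <- cor(
  penguins_complete$bill_length_mm,
  penguins_complete$bill_depth_mm,
  use = "complete.obs"
)
bill_correlation
```

A value below zero agrees with the downward trend visible across the whole dataset.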
library(hrbrthemes)
palmerpenguins::penguins |>
  filter(!is.na(sex)) |>
  ggplot(
    aes(x = bill_length_mm, y = bill_depth_mm)
  ) +
  geom_point(color = "#69b3a2") +
  labs(
    x = "Bill Length (mm)",
    y = "Bill Depth (mm)",
    title = paste("Surprising relationship?")
  ) +
  theme_ipsum()
It is also interesting to note that bill length and bill depth differ considerably from one species to another. This is summarized in the two tables below:
# A tibble: 3 × 2
species average_bill_length
<chr> <dbl>
1 Adelie 38.8
2 Chinstrap 48.8
3 Gentoo 47.5
# A tibble: 3 × 2
species average_bill_depth
<chr> <dbl>
1 Adelie 18.3
2 Chinstrap 18.4
3 Gentoo 15.0
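The code behind the two tables is not shown in this rendering. A minimal sketch of how such per-species averages could be computed with dplyr follows. It assumes `palmerpenguins::penguins` as a stand-in for the cleaned dataset and ignores missing measurements; the object names are hypothetical. Because `species` is a factor in that package, the column type may print as `<fct>` rather than `<chr>`.

```r
library(dplyr)

# Stand-in for the cleaned dataset
penguins_data <- palmerpenguins::penguins

# Average bill length per species, ignoring missing values
avg_length <- penguins_data |>
  group_by(species) |>
  summarise(average_bill_length = mean(bill_length_mm, na.rm = TRUE))

# Average bill depth per species, ignoring missing values
avg_depth <- penguins_data |>
  group_by(species) |>
  summarise(average_bill_depth = mean(bill_depth_mm, na.rm = TRUE))

avg_length
avg_depth
```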
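Now, let’s check the relationship between bill depth and bill length for each species on an island where it was observed (Adelie on Torgersen, Chinstrap on Dream, Gentoo on Biscoe):
# Use the function in functions.R
# Chinstrap penguins were recorded only on Dream and Gentoo only on Biscoe,
# so other species/island pairs would give empty plots
p1 <- create_scatterplot(data, "Adelie", "Torgersen")
p2 <- create_scatterplot(data, "Chinstrap", "Dream")
p3 <- create_scatterplot(data, "Gentoo", "Biscoe")
# Arrange p1 and p2 side by side, with p3 below (patchwork layout operators)
(p1 + p2) / p3
4 Displaying penguins data as a DT table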
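The chunk that builds the table is not shown in this rendering. As a sketch, an interactive table can be created with `datatable()` from the DT package. The dataset used here (`palmerpenguins::penguins`), the object name, and the chosen options are assumptions made for illustration.

```r
library(DT)

# Render the penguin data as an interactive, paginated HTML table
penguin_table <- datatable(
  palmerpenguins::penguins,
  rownames = FALSE,                 # hide the row index column
  options = list(pageLength = 10)   # show 10 rows per page
)
penguin_table
```

In a rendered Quarto document, printing the object displays a searchable, sortable table.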
Making the scatterplot interactive using the ggplotly() function from the plotly library.
library(tidyverse)
library(plotly)
library(hrbrthemes)
# Build the static scatterplot first
scatterplot <- palmerpenguins::penguins |>
  filter(!is.na(sex)) |>
  ggplot(
    aes(x = bill_length_mm, y = bill_depth_mm)
  ) +
  geom_point(color = "#69b3a2") +
  labs(
    x = "Bill Length (mm)",
    y = "Bill Depth (mm)",
    title = "Surprising relationship?"
  ) +
  theme_ipsum()
# Convert the ggplot object into an interactive plotly widget
ggplotly(scatterplot)